Movie reviews: do words add up to a sentiment?
Abstract
Sentiment analysis, the automatic extraction of opinion from text, has been enjoying some attention in the media during the national elections. In this thesis, we discuss the classification of movie reviews as 'thumbs up' or 'thumbs down'. Movie reviews are interesting and difficult because of the wide range of topics in movies. The reviews are HTML web pages, which poses an interesting challenge for preprocessing and noise removal. We describe the reviews as 'bags of words' and use support vector machines (SVMs) for classification, as well as transductive support vector machines, which require less labeled training data. To model topics in the reviews, we performed a latent semantic analysis (LSA) on a large set of movie reviews. The results show that it is hard to improve SVM performance with latent semantic analysis. The discussion of the results provides some insight into why no performance increase was achieved.
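The preprocessing and 'bag of words' steps mentioned in the abstract can be illustrated with a minimal sketch. It uses only the Python standard library; the helper names `html_to_text` and `bag_of_words`, the tokenization rule, and the sample review are assumptions made for illustration and are not taken from the thesis. The sketch stops at the word counts and does not cover SVM training or LSA.

```python
from collections import Counter
from html.parser import HTMLParser
import re


class _TextExtractor(HTMLParser):
    """Collects text nodes of an HTML page, skipping script and style content."""

    def __init__(self):
        super().__init__()
        self.parts = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Keep only text that is not inside <script> or <style>.
        if self._skip_depth == 0:
            self.parts.append(data)


def html_to_text(html):
    """Hypothetical noise-removal step: drop markup, keep visible text."""
    parser = _TextExtractor()
    parser.feed(html)
    parser.close()
    return " ".join(parser.parts)


def bag_of_words(text):
    """Map a text to word counts, ignoring word order (assumed tokenization)."""
    tokens = re.findall(r"[a-z']+", text.lower())
    return Counter(tokens)


# Made-up review page, for illustration only.
review = "<html><body><h1>Review</h1><p>A great film, great cast.</p></body></html>"
bag = bag_of_words(html_to_text(review))
```

Each review then becomes a sparse vector of word counts, which is the kind of representation a classifier such as an SVM can take as input.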
Similar resources
MORE SENSE: MOvie REviews SENtiment analysis boosted with SEmantics
Sentiment analysis is becoming one of the most active areas in Natural Language Processing nowadays. Its importance coincides with the growth of social media and the open space they create for expressing opinions and emotions via reviews, forum discussions, microblogs, Twitter and social networks. Most of the existing approaches on sentiment analysis rely mainly on the presence of affect words t...
Sentiment Classification and Feature based Summarization of Movie Reviews in Mobile Environment
A new framework is designed for a sentiment classification and feature based summarization system in a mobile environment. Posting online reviews has become an increasingly popular way for people to share their opinions about a specific product or service with other users. It has become a common practice for web technologies to provide the venues and facilities for people to publish their reviews. ...
Improving Document-Level Sentiment Classification Using Contextual Valence Shifters
Traditional sentiment feature extraction methods in document-level sentiment classification either count the frequencies of sentiment words as features, or the frequencies of modified and unmodified instances of each of these words. However, these methods do not represent the sentiment words' linguistic context efficiently. We propose a novel method and feature set to handle the contextual polar...
Sentiment Analysis of IMDb movie reviews
There are hundreds of newspaper articles, blogs, magazines and product reviews that get released on the web every day. The New York Times has a database of newspapers spanning over 20 years between 1987 and 2007 that is available online. The online database also contains 1.8 million articles from The Times, and many of these online articles have been manually annotated for people, places and org...
Sentiment Classification for Movie Reviews in Chinese Using Parsing-based Methods
Sentiment classification can help people automatically analyze customers' opinions from a large corpus. In this paper, we collect some Chinese movie reviews from a Bulletin Board System and aim to perform sentiment classification so as to extract several frequent opinion words in some movie elements such as plots, actors/actresses, special effects, and so on. Moreover, we result in a gene...